# ViT backbone network
| Model | License | Task | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| `vit_large_patch16_224.orig_in21k` | Apache-2.0 | Image Classification | timm, Transformers | 584 | 2 | Vision Transformer (ViT) image classification model pre-trained on ImageNet-21k by Google Research in JAX and later ported to PyTorch; suitable for feature extraction and fine-tuning. |
| `vit_base_patch32_224.orig_in21k` | Apache-2.0 | Image Classification | timm, Transformers | 438 | 0 | ViT image classification model pre-trained on ImageNet-21k; suitable for feature extraction and fine-tuning. |
| `samvit_huge_patch16.sa1b` | Apache-2.0 | Image Segmentation | timm, Transformers | 131 | 1 | Segment Anything Vision Transformer (SAM ViT) image feature model; supports feature extraction and fine-tuning only, with no segmentation head. |
| `vit_base_patch14_dinov2.lvd142m` | Apache-2.0 | Image Classification | timm, Transformers | 50.71k | 4 | ViT image feature model pre-trained with the self-supervised DINOv2 method on the LVD-142M dataset. |
| `vit_base_patch16_224.mae` | | Image Classification | timm, Transformers | 23.63k | 2 | ViT image feature model pre-trained on ImageNet-1k with the self-supervised masked autoencoder (MAE) method. |
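All five entries are timm checkpoints, so one loading pattern covers both use cases the descriptions mention: feature extraction and fine-tuning. The sketch below is a minimal example assuming a recent timm release; the choice of checkpoints and the `num_classes=10` head are illustrative placeholders, not part of the listing, and `pretrained=True` downloads the weights on first use.

```python
# Minimal sketch (not an official recipe): using the listed timm ViT
# checkpoints as backbones. Checkpoint names come from the table above;
# num_classes=10 below is a hypothetical dataset size.
import timm
import torch

# 1) Feature extraction: num_classes=0 drops the classifier head, so the
#    forward pass returns pooled embeddings instead of logits.
backbone = timm.create_model(
    "vit_base_patch14_dinov2.lvd142m", pretrained=True, num_classes=0
)
backbone.eval()

# Rebuild the preprocessing pipeline the checkpoint was trained with.
cfg = timm.data.resolve_model_data_config(backbone)
transform = timm.data.create_transform(**cfg, is_training=False)

# Stand-in tensor with the model's expected input size (a real image would
# go through `transform` first).
dummy = torch.randn(1, 3, *cfg["input_size"][1:])
with torch.no_grad():
    pooled = backbone(dummy)                    # pooled embedding, e.g. (1, 768) for a base model
    tokens = backbone.forward_features(dummy)   # per-token features before pooling

# 2) Fine-tuning: request a fresh classification head sized for your dataset;
#    the pretrained backbone weights are kept and only the head is new.
classifier = timm.create_model(
    "vit_base_patch32_224.orig_in21k", pretrained=True, num_classes=10
)
```

Setting `num_classes=0` keeps the pretrained backbone and replaces the classifier with an identity head, which is the usual way to treat these checkpoints as feature extractors; passing a nonzero `num_classes` instead attaches a randomly initialized head for fine-tuning.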